Mining Frequent Sequential Patterns under Regular Expressions: A Highly Adaptive Strategy for Pushing Constraints

Authors

  • Hunor Albert-Lorincz
  • Jean-François Boulicaut
Abstract

This paper introduces a new framework for the extraction of frequent sequences satisfying a given regular expression (RE) constraint. In contrast to previous work (the SPIRIT algorithms), we represent REs by tree structures, and our algorithm can dynamically choose an extraction method according to the local selectivity of the sub-REs. Interestingly, pruning can rely not only on the anti-monotonic minimal frequency constraint but also on the RE constraint, which is generally not anti-monotonic. Preliminary experiments on synthetic data have shown that our algorithm behaves like the best algorithm of the SPIRIT family and even surpasses it.
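To make the problem setting in the abstract concrete, here is a minimal sketch of RE-constrained frequent sequence mining. It is not the tree-based adaptive algorithm of the paper: it is a naive level-wise baseline that prunes only with the anti-monotonic minimal frequency constraint and checks the RE afterwards. The function names (`mine_with_regex`, `support`, `is_subsequence`), the encoding of sequences as strings of single-character items, the subsequence-based notion of support, and the toy database are all assumptions made for illustration.

```python
import re


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Count the data sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, sequence) for sequence in database)


def mine_with_regex(database, alphabet, regex, min_support, max_length):
    """Level-wise search: grow frequent patterns one item at a time.

    Frequency is anti-monotonic, so an infrequent pattern is never extended.
    The RE constraint is generally not anti-monotonic, so this naive baseline
    only uses it to filter the reported patterns, not to prune the search.
    """
    compiled = re.compile(regex)
    frequent = [""]          # patterns of the current length that are frequent
    results = {}             # frequent patterns accepted by the RE -> support
    for _ in range(max_length):
        next_level = []
        for prefix in frequent:
            for item in alphabet:
                candidate = prefix + item
                count = support(candidate, database)
                if count >= min_support:
                    next_level.append(candidate)
                    if compiled.fullmatch(candidate):
                        results[candidate] = count
        frequent = next_level
    return results


# Hypothetical toy database: each string is one sequence of items.
toy_database = ["abcb", "abb", "acb", "bca"]
patterns = mine_with_regex(toy_database, "abc", r"a(b|c)*b", min_support=2, max_length=3)
print(patterns)
```

In this baseline the RE never reduces the search space. Using the RE to prune even though it is generally not anti-monotonic is precisely the difficulty that the SPIRIT family and the framework described in the abstract address.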

Similar resources

Mining frequent sequential patterns under regular expressions: a highly adaptative strategy for pushing constraints

This paper introduces a new framework for the extraction of frequent sequences satisfying a given regular expression (RE) constraint. Contrary to previous work (SPIRIT algorithms), we represent REs by tree structures and our algorithm can choose dynamically an extraction method according to the local selectivity of the sub-REs. Interestingly, pruning can rely not only on the anti-monotonic mini...

First-Order Temporal Pattern Mining with Regular Expression Constraints

Previous studies on mining sequential patterns have focused on temporal patterns specified by some form of propositional temporal logic. However, there are some interesting sequential patterns, such as the multi-sequential patterns, whose specification needs a more expressive formalism, the first-order temporal logic. In this paper, we extend a well-known user-controlled tool, based on regular ...

SPIRIT: Sequential Pattern Mining with Regular Expression Constraints

Discovering sequential patterns is an important problem in data mining with a host of application domains including medicine, telecommunications, and the World Wide Web. Conventional mining systems provide users with only a very restricted mechanism (based on minimum support) for specifying patterns of interest. In this paper, we propose the use of Regular Expressions (REs) as a flexible constr...

Mining Sequential Patterns with Regular Expression Constraints

Discovering sequential patterns is an important problem in data mining with a host of application domains including medicine, telecommunications, and the World Wide Web. Conventional sequential pattern mining systems provide users with only a very restricted mechanism (based on minimum support) for specifying patterns of interest. As a consequence, the pattern mining process is typically chara...

Temporal Support of Regular Expressions in Sequential Pattern Mining

Classic algorithms for sequential pattern discovery return all frequent sequences present in a database. Since, in general, only a few are interesting from a user’s point of view, languages based on regular expressions (RE) have been proposed to restrict frequent sequences to the ones that satisfy user-specified constraints. Although the support of a sequence is computed as the number of ...

Publication date: 2003